INTRODUCTION

The objective of our program is to develop and validate a procedure for ocean color data merging, which is
one of the major goals of the SIMBIOS project (McClain et al., 1995). The need for a merging capability is
dictated by the fact that, since the launch of MODIS on the Terra platform and over the next decade, several
global ocean color missions from various space agencies are or will be operational simultaneously. The
apparent redundancy in simultaneous ocean color missions can actually be exploited to various benefits.
The most obvious benefit is improved coverage (Gregg et al., 1998; Gregg & Woodward, 1998). The
patchy and uneven daily coverage from any single sensor can be improved by using a combination of
sensors. Besides improved coverage of the global ocean, the merging of ocean color data should also result
in new, more diverse data products with lower uncertainties. Ultimately, ocean color data merging should
result in the development of a unified, scientific-quality ocean color time series, from
SeaWiFS to NPOESS and beyond.

Various approaches can be used for ocean color data merging, and several have been tested within the frame
of the SIMBIOS program (see e.g. Kwiatkowska & Fargion, 2003; Franz et al., 2003). As part of the
SIMBIOS Program, we have developed a merging method for ocean color data. Unlike other
methods, our approach does not combine end-products like the subsurface chlorophyll concentration (Chl)
from different sensors to generate a unified product. Instead, our procedure takes the normalized
water-leaving radiances (LwN(λ)) from single or multiple sensors and uses them in the inversion of a
semi-analytical ocean color model that allows the retrieval of several ocean color variables simultaneously.
Besides ensuring simultaneity and consistency of the retrievals (all products are derived from a single
algorithm), this model-based approach has various benefits over techniques that blend end-products (e.g.
chlorophyll): 1) it works with single or multiple data sources regardless of their specific bands, 2) it
exploits band redundancies and band differences, 3) it accounts for uncertainties in the LwN(λ) data and 4)
it provides uncertainty estimates for the retrieved variables.

RESEARCH ACTIVITIES 

1) Development of an ocean color database and algorithm development 

Over the past 3 years, we have assembled a large, comprehensive in situ ocean color data set that contains
inherent (IOP) and apparent (AOP) optical properties as well as chlorophyll a concentration data from
various locations. This database is designed for ocean color algorithm development. Since it contains both
IOPs and AOPs, it is better suited for semi-analytical algorithm development than data sets like the one
used during SeaBAM (O'Reilly et al., 1998), which only contained Chl and remote sensing reflectance data.
Most of the data included in the database come from the NASA SIMBIOS SeaBASS archive, but several
investigators have provided data sets or subsets directly to us. Various quality control (QC) procedures
have been developed (Fargion & McClain, 2003) to identify corrupted data, outliers or specific bio-optical
situations. The database contains chlorophyll a concentration, diffuse attenuation coefficients, LwN(λ),
particulate backscattering and component absorption (phytoplankton, detrital, dissolved). Most of the
absorption data are hyperspectral. The current status of the IOP/AOP data set is described in Table 1.


Table 1. Status of the AOP/IOP data set. The first number in each cell indicates the number of stations for
which data are available. The numbers in parentheses indicate the number of available wavelengths. 
modifications we are trying to implement. A preliminary version of the hyperspectral model has been
developed and optimized using data from the AOP/IOP database. Although it shows some good overall 
results for all three retrieved variables this preliminary hyperspectral version does not always perform well 
at the extremes of the chlorophyll range (either in very clear or very rich waters). This new version of the 
model still requires some work and a more conservative, step-by-step approach is now used for its 
development. 


2) Satellite Ocean Color Data Merging 

Our approach for ocean color data merging is based on the inversion of a semi-analytical model that relates
LwN(λ) to the backscattering and absorption coefficients (Eq. 1) as described in Gordon et al. (1988).


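$$L_{wN}(\lambda) = \frac{t\,F_0(\lambda)}{n_w^2} \sum_{i=1}^{2} g_i \left[ \frac{b_{bw}(\lambda) + b_{bp}(\lambda)}{b_{bw}(\lambda) + b_{bp}(\lambda) + a_w(\lambda) + a_{ph}(\lambda) + a_{cdm}(\lambda)} \right]^i \qquad (1)$$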


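Each of the non-water components in the absorption (a) and backscattering (bb) coefficients is expressed
as a known shape function with an unknown magnitude:

$$a_{ph}(\lambda) = \mathrm{Chl}\; a_{ph}^{*}(\lambda) \qquad (2)$$

$$a_{cdm}(\lambda) = a_{cdm}(443)\, \exp\left(-S(\lambda - 443)\right) \qquad (3)$$

$$b_{bp}(\lambda) = b_{bp}(443)\, (\lambda/443)^{-\eta} \qquad (4)$$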


where t, n_w, g_i, F0(λ), aw(λ) and bbw(λ) are taken from the literature, whereas η, S and aph*(λ) were
determined by “tuning” the model against a large in situ data set (Maritorena et al., 2002). A non-linear
least-squares fitting technique is used to solve for the unknowns (Chl, acdm(443) and bbp(443)) from LwN(λ)
data at 4 or more wavelengths. The model also provides uncertainty estimates for each of the retrievals
using a linear approximation to the calculation of non-linear regression inference regions (Bates & Watts,
1988). The model, hereafter referred to as the GSM01 model, is fully described in Garver & Siegel (1997)
and Maritorena et al. (2002).
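
As a reading aid, the forward model of Eqs. 1-4 can be sketched in a few lines of Python. This is only an
illustration under stated assumptions: the function name lwn_model, the COEFFS dictionary and every numeric
value in it are hypothetical placeholders chosen for the example, not the tuned GSM01 coefficients or the
literature values used by the authors.

```python
import math

# Hypothetical placeholder coefficients for two bands (nm). These are NOT the
# GSM01 values; real spectra of F0, aw, bbw and aph* must come from the literature.
COEFFS = {
    "t": 0.98,                      # assumed sea-air transmission factor
    "nw": 1.34,                     # assumed refractive index of seawater
    "g": (0.0949, 0.0794),          # assumed g_1, g_2 (check against Gordon et al., 1988)
    "S": 0.02,                      # assumed CDM spectral slope
    "eta": 1.0,                     # assumed backscattering power-law exponent
    "F0": {443: 190.0, 555: 185.0},
    "aw": {443: 0.007, 555: 0.06},
    "bbw": {443: 0.0024, 555: 0.0009},
    "aph_star": {443: 0.05, 555: 0.005},
}


def lwn_model(wavelength, chl, acdm443, bbp443, coeffs):
    """Modeled LwN at one wavelength (nm) from the three unknowns."""
    aph = chl * coeffs["aph_star"][wavelength]                      # Eq. 2
    acdm = acdm443 * math.exp(-coeffs["S"] * (wavelength - 443.0))  # Eq. 3
    bbp = bbp443 * (wavelength / 443.0) ** (-coeffs["eta"])         # Eq. 4
    bb = coeffs["bbw"][wavelength] + bbp        # total backscattering
    a = coeffs["aw"][wavelength] + aph + acdm   # total absorption
    u = bb / (bb + a)
    g1, g2 = coeffs["g"]
    scale = coeffs["t"] * coeffs["F0"][wavelength] / coeffs["nw"] ** 2
    return scale * (g1 * u + g2 * u ** 2)                           # Eq. 1


if __name__ == "__main__":
    for band in (443, 555):
        print(band, lwn_model(band, chl=0.2, acdm443=0.01, bbp443=0.002, coeffs=COEFFS))
```

The inversion described above would wrap such a function in a non-linear least-squares solver that adjusts
Chl, acdm(443) and bbp(443) until the modeled spectrum matches the measured one.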

Since the model retrievals are generated using a curve-fitting technique that minimizes the least-squares
differences between the measured and modeled LwN(λ), it is straightforward to use the GSM01 model with
multiple data sets, as in data merging. When multiple data sets are used (e.g. two or more sensors have
LwN measurements over a given pixel), the LwN(λ) data from all available sources are concatenated (as is
the relevant waveband information), allowing the curve-fitting step to be conducted with more data points
than with a single source. When data sources have different bands, the fitting procedure also benefits from
an increased spectral resolution. A key aspect of the procedure is that there is no transformation or
averaging of the input LwN data; they are used "as is" in the curve-fitting technique. A schematic of the
input and output products of the merging model is presented in figure 1.
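
The concatenation step can be illustrated with a minimal Python sketch. The function names, the data
layout and the radiance values are assumptions made for this example (only the band centers correspond to
actual SeaWiFS and MODIS ocean bands); the sketch simply stacks every (wavelength, LwN) pair without
averaging and evaluates an unweighted least-squares misfit over the combined set.

```python
def concatenate_sources(sources):
    """Stack (sensor, wavelength, LwN) triples from all sensors.

    Shared bands (e.g. 443 nm) are kept as separate data points: nothing is
    averaged or transformed before the fit.
    """
    merged = []
    for sensor, spectrum in sources.items():
        for wavelength, lwn in sorted(spectrum.items()):
            merged.append((sensor, wavelength, lwn))
    return merged


def misfit(model, params, observations):
    """Unweighted sum of squared differences between modeled and measured LwN."""
    return sum((model(wl, *params) - lwn) ** 2 for _, wl, lwn in observations)


# Hypothetical LwN values for one pixel observed by two sensors.
pixel = {
    "SeaWiFS": {412: 1.10, 443: 1.25, 490: 1.20, 510: 0.85, 555: 0.40},
    "MODIS": {412: 1.14, 443: 1.22, 488: 1.18, 531: 0.62, 551: 0.42},
}

observations = concatenate_sources(pixel)
n_points = len(observations)                        # data points entering the fit
n_bands = len({wl for _, wl, _ in observations})    # distinct wavelengths sampled
```

With these inputs the fit sees more data points than either sensor provides alone, and the distinct
wavelengths of the two band sets give a denser spectral sampling.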

Our model-based approach for ocean color data merging was first tested using SeaWiFS and MOS data, and
results were presented during the SIMBIOS Science Team meeting in Baltimore (Jan. 15-17, 2002). Since
then, we have successfully used SeaWiFS and MODIS data (from both the Terra and Aqua platforms). We
have tested our merging approach using daily level-3 data from SeaWiFS (reprocessing #4, 9 km) and
MODIS (collection #4, 4.6 km) for 18 dates between December 4, 2000 and March 22, 2003. Only the
MODIS “best quality” data (i.e. quality 0) were used during these tests. Since the SeaWiFS and MODIS
LwN(λ) data products have different spatial resolutions, it is necessary to first adapt the MODIS data to the
SeaWiFS resolution by averaging four 4.6 km bins into a 9 km bin, so that the 2 data sources are set to a
common binned grid. The data are processed between 65 degrees North and 65 degrees South. We have
focused most of our effort on the period for which MODIS Terra collection #4 products are validated
(11/1/2000-3/19/2002). Some data outside this time window were also used to illustrate
SeaWiFS/Terra/Aqua data merging and to assess improvement in coverage when merging 3 different ocean
color data sources.
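
The resolution-matching step can be sketched as a 2x2 block average. The sketch below makes two
assumptions that are not stated in the text: the bins are laid out on a regular rectangular grid (actual
level-3 bins follow their own binning scheme), and a coarse bin is computed from whichever of its four
fine bins contain valid data. The function name coarsen_2x2 is hypothetical.

```python
def coarsen_2x2(grid):
    """Average each 2x2 block of fine bins into one coarse bin.

    `grid` is a list of equal-length rows; None marks a bin without valid data.
    Assumption: a coarse bin is the mean of its valid fine bins, and it is
    None only when all four fine bins are missing.
    """
    n_rows, n_cols = len(grid), len(grid[0])
    coarse = []
    for r in range(0, n_rows - 1, 2):
        row = []
        for c in range(0, n_cols - 1, 2):
            block = [grid[r + dr][c + dc] for dr in (0, 1) for dc in (0, 1)]
            valid = [v for v in block if v is not None]
            row.append(sum(valid) / len(valid) if valid else None)
        coarse.append(row)
    return coarse


# Hypothetical 2 x 4 patch of fine-resolution LwN values.
fine = [
    [1.0, 3.0, None, None],
    [5.0, 7.0, None, 2.0],
]
coarse = coarsen_2x2(fine)
```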


RESEARCH RESULTS 


The aim of our SIMBIOS work is to demonstrate the feasibility of an ocean color data merging procedure
based on a semi-analytical model that uses LwN(λ) data from one or more sources. It is beyond the scope of
this report to document the accuracy of the model retrievals with in situ or satellite data (but see Maritorena
et al., 2002; Siegel et al., 2002 and Siegel et al., 2003). Examples of global maps of chlorophyll a
concentration, acdm(443) and bbp(443) generated by the GSM01 merging model with SeaWiFS and
MODIS-Terra LwN(λ) data are presented in figure 2. The global maps of Chl, acdm(443) and bbp(443)
generated by the GSM01 merging model using the Terra and SeaWiFS data show very good consistency
overall (Maritorena et al., 2003). In general, the retrieved fields do not show discontinuities where the
model switches from an area with a single data source to an area where both SeaWiFS and MODIS LwN(λ)
data were used. This reflects the generally good agreement between the LwN(λ) data from both sources and
the robustness of the model. The level of agreement between the two sensors also has an influence on the
estimated uncertainties of the derived products. When considering the pixels that are covered by both
sensors, a very large majority of them show reduced uncertainties in the merged products compared to
those generated from a single data source. Figure 3 shows the frequency distribution of the ratio of the Chl
uncertainties using either SeaWiFS or MODIS LwN data alone over the Chl uncertainties when both sources
are used. Overall, the uncertainties tend to decrease when multiple sources are used; this holds for all 3
products generated by the GSM01 model and for all the dates we have processed. In the worst case, 70% of
the pixels showed lower uncertainties in the merged products. Uncertainties are generally higher in the
merged products when the LwN data do not agree well between the data sources. This is illustrated in
figure 4, where LwN(443) from both sensors are compared when uncertainties improved in the merged product
(merged/unmerged uncertainty ratio of 0.5 or less) and when they got worse (uncertainty ratio > 1). When
the uncertainties are lower in the merged product, the agreement between both sets of LwN(λ) is generally
very good, with most of the data points on or very close to the 1:1 line (figure 4 shows the 443 nm data but
this is true for all bands). When the uncertainties were higher in the merged images, the LwN(λ) data showed
some clear differences between SeaWiFS and MODIS. Areas where the product uncertainties are actually
higher in the merged image are frequently next to gaps caused by sun glint, gaps between the swaths or
clouds. They also appear to be mostly located in the Southern Hemisphere. Further analyses are needed to
assess what causes these features.
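
The kind of pixel-level statistic discussed above can be computed as in the following Python sketch. It
adopts the merged/unmerged ratio convention used for figure 4 (a ratio below 1 means the merged product
has the lower uncertainty); the function names and the sample values are hypothetical and serve only to
show the bookkeeping.

```python
def uncertainty_ratios(single, merged):
    """Merged/unmerged uncertainty ratio for pixels that have both estimates.

    None marks a pixel without a retrieval from that processing.
    """
    return [
        m / s
        for s, m in zip(single, merged)
        if s is not None and m is not None and s > 0
    ]


def fraction_improved(ratios):
    """Fraction of pixels whose merged uncertainty is lower (ratio < 1)."""
    return sum(r < 1.0 for r in ratios) / len(ratios)


# Hypothetical Chl uncertainties for four pixels.
single_source = [0.2, 0.4, None, 0.5]
merged_product = [0.1, 0.5, 0.3, 0.25]

ratios = uncertainty_ratios(single_source, merged_product)
improved = fraction_improved(ratios)
```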

The improvement in coverage is clear. The daily surface area effectively covered by any individual sensor
depends upon various factors such as the sensor’s technical and orbital characteristics, sun glint, cloud
cover and season. The increased coverage that results from the use of multiple data sources is illustrated in
figure 5 for the 18 dates we have used. Daily coverage jumps from approximately 12-15% of the ocean
surface (in the 65N to 65S range) when SeaWiFS is used alone to approximately 25% when it is used with
MODIS-Terra. When MODIS-Aqua data are used in the merging process along with SeaWiFS and Terra, the
daily coverage reaches approximately 30-35% of the ocean surface. These numbers agree well with those
derived from a theoretical analysis prior to the SeaWiFS and MODIS launches (Gregg and Woodward, 1998)
as well as with those obtained by the SIMBIOS Project with their Level-3 merged chlorophyll product
(Franz et al.).
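
A coverage figure of this kind can be computed as the share of ocean bins seen by at least one sensor. The
Python sketch below assumes equal-area bins identified by integer indices, so that counting bins is
equivalent to summing area; the function name and the toy bin sets are hypothetical and do not reproduce
the data of figure 5.

```python
def coverage_percent(ocean_bins, *sensor_bins):
    """Percentage of ocean bins observed by at least one of the given sensors.

    Assumes equal-area bins, so the bin count stands in for surface area.
    """
    seen = set().union(*sensor_bins) & ocean_bins
    return 100.0 * len(seen) / len(ocean_bins)


# Toy example: 100 ocean bins and three partly overlapping daily swaths.
ocean = set(range(100))
seawifs = set(range(0, 14))
terra = set(range(10, 26))
aqua = set(range(24, 33))

one_sensor = coverage_percent(ocean, seawifs)
two_sensors = coverage_percent(ocean, seawifs, terra)
three_sensors = coverage_percent(ocean, seawifs, terra, aqua)
```

Because overlapping bins are counted once, the merged coverage is smaller than the sum of the individual
coverages, which is why adding a third sensor brings a smaller gain than adding the second.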

The absence of marked discontinuities in the mapped products and the lower product uncertainties are
two important results. At this point, it should be mentioned that some of the features of our merging
approach cannot be used to their full efficiency, mostly because the MODIS data are not fully characterized
or stabilized. For example, the direct use of all available LwN data in the fitting procedure assumes that
they agree well, have similar uncertainty levels and that none of the data sources contains a noticeable
bias. This may not always be true. Although not used in the results presented here, the merging model has
the ability to weight each individual LwN(λ) datum to ensure that the best observations are given a higher
weight in the fitting procedure that leads to the derivation of the retrievals. Uncertainties (σ_i(λ_j)) of
the input LwN(λ) can be accounted for in the least squares minimization (LSM) procedure as

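$$\mathrm{LSM} = \sum_{i=1}^{N_s} \sum_{j=1}^{N_\lambda} \left[ \frac{L_{wN\text{-}i}(\lambda_j)_{mod.} - L_{wN\text{-}i}(\lambda_j)_{meas.}}{\sigma_i(\lambda_j)} \right]^2 \qquad (5)$$

where N_s is the number of data sources and N_λ is the number of bands for each source. This has not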
yet been used, mostly because the uncertainties associated with the MODIS bands of Terra and Aqua cannot
yet be fully assessed in time and space. This requires matchup analyses from a large and diverse set of in
situ and satellite data. These analyses are available for SeaWiFS, but more matchup points are needed to
complete the analysis for the MODIS data. It is also necessary to have some knowledge of the variability
of the uncertainties in space and time. Once the characterization of Terra and Aqua is detailed enough, it
should be possible to implement uncertainty weighting functions. A consistent BRDF correction scheme for
the sensors involved in the data merging would also represent an improvement, and upcoming reprocessings
of SeaWiFS and MODIS data should take care of that particular aspect.
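
The uncertainty-weighted cost of Eq. 5 can be sketched as follows. The function name weighted_lsm, the
data layout (one list of (wavelength, LwN, sigma) triples per source) and the numbers are assumptions made
for this illustration; as stated above, such weighting was not applied in the results of this report.

```python
def weighted_lsm(model, params, sources):
    """Eq. 5: sum over sources and bands of ((modeled - measured) / sigma)^2."""
    total = 0.0
    for spectrum in sources:                     # loop over data sources
        for wavelength, lwn, sigma in spectrum:  # loop over bands of that source
            total += ((model(wavelength, *params) - lwn) / sigma) ** 2
    return total


def constant_model(wavelength, level):
    """Toy model returning the same value at every wavelength."""
    return level


# Two hypothetical sources observing the same band with different uncertainties.
sources = [
    [(443, 1.0, 0.1)],   # more certain observation
    [(443, 2.0, 0.2)],   # less certain observation
]

cost_weighted_mean = weighted_lsm(constant_model, (1.2,), sources)
cost_plain_mean = weighted_lsm(constant_model, (1.5,), sources)
```

In this toy case the cost is lower at 1.2, the inverse-variance weighted mean, than at 1.5, the plain
average, which shows how the more certain observation pulls the fit toward itself.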

In this project, we have demonstrated that ocean color data merging can effectively be conducted
using LwN data and the inversion of a semi-analytical algorithm. The method can be applied
straightforwardly to any suite of ocean color sensors. Besides the feasibility aspect, the improvement in
daily coverage and the lower uncertainties in the merged products are two important results of our work.
Some refinements (e.g. weighting functions based on the uncertainty levels of the input LwN(λ) data, BRDF
correction) can be added to the current approach in the future, once some of the satellite data are more
mature.
Figure 1. Schematic of the input and output data of the semi-analytical ocean color merging model.









Figure 2. Example of daily images (December 4, 2000) of Chl (upper panel), acdm(443) (center panel) and
bbp(443) (lower panel) generated by the GSM01 merging model using daily level-3 LwN(λ) data from
SeaWiFS and MODIS-Terra.
Figure 3. Frequency distribution of the ratio of the Chl uncertainties using either SeaWiFS or MODIS-Terra
LwN data alone (December 4, 2000) over the Chl uncertainties when both sources are used (for the same
pixels). Ratios < 1 show an improvement in the uncertainties of the merged products.
Figure 4. Comparison of LwN(443) data from SeaWiFS and MODIS-Terra when uncertainties
improved in the merged Chl product (merged/unmerged uncertainty ratio of 0.5 or less, upper panels) and
when they got worse (uncertainty ratio > 1, lower panels).







Figure 5. Daily coverage resulting from the merging of SeaWiFS, MODIS-Terra and MODIS-Aqua for the
18 dates used in this study. The coverage is computed as the percentage of the total ocean area (between 65
degrees North and 65 degrees South).



